188 research outputs found
Regression and Learning to Rank Aggregation for User Engagement Evaluation
User engagement refers to the amount of interaction an instance (e.g., a tweet,
a news item, or a forum post) receives. Ranking items on social media websites
by the amount of user participation they attract can be used in different
applications, such as recommender systems. In this paper, we consider a tweet
containing a rating for a movie as an instance and focus on ranking the
instances of each user based on their engagement, i.e., the total number of
retweets and favorites each will gain.
For this task, we define several features which can be extracted from the
meta-data of each tweet. The features are partitioned into three categories:
user-based, movie-based, and tweet-based. We show that in order to obtain good
results, features from all categories should be considered. We exploit
regression and learning to rank methods to rank the tweets and propose to
aggregate the results of regression and learning to rank methods to achieve
better performance. We have run our experiments on an extended version of the
MovieTweeting dataset provided by the ACM RecSys Challenge 2014. The results show
that the learning to rank approach outperforms most of the regression models and
that the combination can improve the performance significantly.
Comment: In Proceedings of the 2014 ACM Recommender Systems Challenge,
RecSysChallenge '1
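
The abstract above does not say how the regression and learning to rank outputs are aggregated. As a minimal sketch of the general idea only, the following combines two scorers with a Borda count, which is a swapped-in, generic rank aggregation technique and not necessarily the authors' method. The function name `borda_aggregate`, the tweet ids, and all scores are hypothetical.

```python
from typing import Dict, List


def borda_aggregate(score_lists: List[Dict[str, float]]) -> List[str]:
    """Combine several models' scores for the same items with a Borda count.

    Each dict maps a tweet id to the score one model assigned to it
    (higher means more predicted engagement). This is an illustrative
    aggregation rule, not the one used in the paper.
    """
    points: Dict[str, int] = {}
    for scores in score_lists:
        # Rank this model's items from best to worst.
        ranked = sorted(scores, key=lambda t: scores[t], reverse=True)
        n = len(ranked)
        for position, tweet_id in enumerate(ranked):
            # The top item earns n-1 points, the last item earns 0.
            points[tweet_id] = points.get(tweet_id, 0) + (n - 1 - position)
    # Order by total points; break ties by id so the output is deterministic.
    return sorted(points, key=lambda t: (-points[t], t))


# Hypothetical scores from a regression model and a learning to rank model.
regression_scores = {"t1": 3.2, "t2": 0.4, "t3": 1.1, "t4": 2.0}
ltr_scores = {"t1": 0.6, "t2": 0.1, "t3": 0.8, "t4": 0.9}

combined = borda_aggregate([regression_scores, ltr_scores])
print(combined)
```

A rank-based rule is used here because regression outputs and learning to rank scores are generally not on the same scale, so adding raw scores would let one model dominate.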
Neural Models for Information Retrieval without Labeled Data
Recent developments in machine learning models, and in particular deep neural networks, have yielded significant improvements on several computer vision, natural language processing, and speech recognition tasks. Progress with information retrieval (IR) tasks has been slower, however, due to the lack of large-scale training data as well as neural network models specifically designed for effective information retrieval. In this dissertation, we address these two issues by introducing task-specific neural network architectures for a set of IR tasks and proposing novel unsupervised or weakly supervised solutions for training the models. The proposed learning solutions do not require labeled training data. Instead, in our weak supervision approach, neural models are trained on a large set of noisy and biased training data obtained from external resources, existing models, or heuristics.
We first introduce relevance-based embedding models that learn distributed representations for words and queries. We show that the learned representations can be effectively employed for a set of IR tasks, including query expansion, pseudo-relevance feedback, and query classification.
We further propose a standalone learning to rank model based on deep neural networks. Our model learns a sparse representation for queries and documents. This enables us to perform efficient retrieval by constructing an inverted index in the learned semantic space. Our model outperforms state-of-the-art retrieval models, while performing as efficiently as term matching retrieval models.
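
To illustrate why a sparse representation allows retrieval through an inverted index, here is a minimal sketch. It assumes each query and document has already been mapped to a sparse vector, stored as a dict from latent dimension to non-zero weight, and that relevance is scored by a dot product. The vectors, the names `build_index` and `search`, and the scoring function are assumptions made for illustration; the learned model itself is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SparseVec = Dict[int, float]  # latent dimension -> non-zero weight


def build_index(docs: Dict[str, SparseVec]) -> Dict[int, List[Tuple[str, float]]]:
    """Build posting lists keyed by latent dimension instead of by term."""
    index: Dict[int, List[Tuple[str, float]]] = defaultdict(list)
    for doc_id, vec in docs.items():
        for dim, weight in vec.items():
            if weight != 0.0:
                index[dim].append((doc_id, weight))
    return index


def search(index: Dict[int, List[Tuple[str, float]]],
           query: SparseVec, k: int = 10) -> List[Tuple[str, float]]:
    """Score only documents that share a non-zero dimension with the query."""
    scores: Dict[str, float] = defaultdict(float)
    for dim, q_weight in query.items():
        for doc_id, d_weight in index.get(dim, []):
            scores[doc_id] += q_weight * d_weight  # dot-product accumulation
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical sparse vectors standing in for the model's output.
docs = {"d1": {0: 0.5, 7: 1.0}, "d2": {3: 2.0}, "d3": {7: 0.5, 3: 1.0}}
index = build_index(docs)
results = search(index, {7: 2.0, 3: 0.5})
print(results)
```

Because only the posting lists for the query's non-zero dimensions are traversed, the cost depends on how sparse the vectors are, which is the same property that makes term matching retrieval efficient.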
We additionally propose a neural network framework for predicting the performance of a retrieval model for a given query. Inspired by existing query performance prediction models, our framework integrates several information sources, such as retrieval score distribution and term distribution in the top retrieved documents. This leads to state-of-the-art results for the performance prediction task on various standard collections.
We finally bridge the gap between retrieval and recommendation models, as the two key components in most information systems. Search and recommendation often share the same goal: helping people get the information they need at the right time. Therefore, joint modeling and optimization of search engines and recommender systems could potentially benefit both systems. In more detail, we introduce a retrieval model that is trained using user-item interaction (e.g., recommendation data), with no need for query-document relevance information for training.
Our solutions and findings in this dissertation smooth the path towards learning efficient and effective models for various information retrieval and related tasks, especially when large-scale training data is not available.
ANTIQUE: A Non-Factoid Question Answering Benchmark
Considering the widespread use of mobile and voice search, answer passage
retrieval for non-factoid questions plays a critical role in modern information
retrieval systems. Despite the importance of the task, the community still
lacks large-scale non-factoid question answering
collections with real questions and comprehensive relevance judgments. In this
paper, we develop and release a collection of 2,626 open-domain non-factoid
questions from a diverse set of categories. The dataset, called ANTIQUE,
contains 34,011 manual relevance annotations. The questions were asked by real
users in a community question answering service, i.e., Yahoo! Answers.
Relevance judgments for all the answers to each question were collected through
crowdsourcing. To facilitate further research, we also include a brief analysis
of the data as well as baseline results on both classical and recently
developed neural IR models.
Target Apps Selection: Towards a Unified Search Framework for Mobile Devices
With the recent growth of conversational systems and intelligent assistants
such as Apple Siri and Google Assistant, mobile devices are becoming even more
pervasive in our lives. As a consequence, users are increasingly engaged with
mobile apps and frequently search within their apps to satisfy information needs.
However, users cannot search within their apps through their intelligent
assistants. This requires a unified mobile search framework that identifies the
target app(s) for the user's query, submits the query to the app(s), and
presents the results to the user. In this paper, we take the first step forward
towards developing unified mobile search. In more detail, we introduce and
study the task of target apps selection, which has various potential real-world
applications. To this end, we analyze attributes of search queries as well as
user behaviors while searching with different mobile apps. The analyses are
done based on thousands of queries that we collected through crowdsourcing. We
finally study the performance of state-of-the-art retrieval models for this
task and propose two simple yet effective neural models that significantly
outperform the baselines. Our neural approaches are based on learning
high-dimensional representations for mobile apps. Our analyses and experiments
suggest specific future directions in this research area.
Comment: To appear at SIGIR 201
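
As a rough illustration of target apps selection with app representations, the sketch below ranks apps by cosine similarity between a query vector and one vector per app. The abstract does not describe the neural models, so this is not the authors' architecture: the three-dimensional vectors are hand-set toy values rather than learned representations, and the names `cosine` and `select_apps` are hypothetical.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_apps(query_vec: List[float],
                app_vecs: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the k apps whose representation is most similar to the query."""
    ranked = sorted(app_vecs,
                    key=lambda app: cosine(query_vec, app_vecs[app]),
                    reverse=True)
    return ranked[:k]


# Toy, hand-set vectors; in practice these would be learned from data.
app_vecs = {
    "maps": [0.9, 0.1, 0.0],
    "music": [0.0, 0.2, 0.9],
    "mail": [0.1, 0.9, 0.1],
}
query_vec = [0.8, 0.3, 0.0]

top_apps = select_apps(query_vec, app_vecs, k=2)
print(top_apps)
```

Returning the top k apps rather than a single one mirrors the "target app(s)" wording in the abstract, where a query may be routed to more than one app.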